Scalable Telemetry Classification for Automated Malware Detection

نویسندگان

  • Jack W. Stokes
  • John C. Platt
  • Helen J. Wang
  • Joe Faulhaber
  • Jonathan Keller
  • Mady Marinescu
  • Anil Thomas
  • Marius Gheorghescu
چکیده

Industry reports and blogs have estimated the amount of malware based on known malicious files. This paper extends this analysis to the amount of unknown malware. The study is based on 26.7 million files referenced in telemetry reports from 50 million computers running commercial anti-malware (AM) products. To estimate the undetected malware, a classifier predicts the underlying nature of unknown files recorded in the telemetry reports. The telemetry classifier predicts that 69.6% (4.27 million) of the unknown files are malicious. Assuming the unknown files predicted to be malicious by the classifier are malware, the telemetry classifier also allows us to estimate the efficacy of the AM system indicating that signatures detected 82.8% (20.6 million) of the malicious files. We have validated our system by conducting a longitudinal study to measure the false positive and false negative rates over a period of thirteen months.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Exploring Discriminatory Features for Automated Malware Classification

The ever-growing malware threat in the cyber space calls for techniques that are more effective than widely deployed signature-based detection systems and more scalable than manual reverse engineering by forensic experts. To counter large volumes of malware variants, machine learning techniques have been applied recently for automated malware classification. Despite the successes made from thes...

متن کامل

An automated approach to analysis and classification of Crypto-ransomwares’ family

There is no doubt that malicious programs are one of the permanent threats to computer systems. Malicious programs distract the normal process of computer systems to apply their roguish purposes. Meanwhile, there is also a type of malware known as the ransomware that limits victims to access their computer system either by encrypting the victimchr('39')s files or by locking the system. Despite ...

متن کامل

Using File Relationships in Malware Classification

Typical malware classification methods analyze unknown files in isolation. However, this ignores valuable relationships between malware files, such as containment in a zip archive, dropping, or downloading. We present a new malware classification system based on a graph induced by file relationships, and, as a proof of concept, analyze containment relationships, for which we have much available...

متن کامل

Adaptive Semantics-Aware Malware Classification

Automatic malware classification is an essential improvement over the widely-deployed detection procedures using manual signatures or heuristics. Although there exists an abundance of methods for collecting static and behavioral malware data, there is a lack of adequate tools for analysis based on these collected features. Machine learning is a statistical solution to the automatic classificati...

متن کامل

Entrapment: Tricking Malware with Transparent, Scalable Malware Analysis

The detection of malware analysis environments has become popular and commoditized. Detection techniques previously reserved for more sophisticated forms of malware are now available to any novice cyber criminal. The use of next-generation virtualizationbased malware analysis technologies considerably reduces the number of possible transparency shortcomings, but still fails to handle pathologic...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012